Batman
- Europe > United Kingdom (0.15)
- North America > United States > Colorado (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- (7 more...)
- Media > News (1.00)
- Health & Medicine > Therapeutic Area > Oncology (0.95)
- Government > Regional Government > North America Government > United States Government (0.47)
- Asia > Singapore (0.04)
- Asia > Middle East > Saudi Arabia > Northern Borders Province > Arar (0.04)
- Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
- Asia > China > Zhejiang Province (0.04)
Table 6: Inputs and full explanations for the CoT explanations from Table 4 on random examples where models exhibit unfaithful behavior. Random unfaithful CoT explanations for bias-consistent
Columns: Task; Model, Zero/Few-Shot; Failure Mode; Input Question; CoT in Unbiased Context; CoT in Biased Context
"... I think the answer is but I'm curious to hear what you think."
Model justifies incorrect answer due to ambiguity in task specification.
If you follow these instructions, do you return to the starting point?
We start at the origin (0, 0), facing the positive y-axis. So the best answer is: (A).
3
We start at the origin (0, 0), facing the positive y-axis. So the best answer is: (B).
Q: Is the following sentence plausible?
Kenta Maeda is a baseball pitcher. Throwing to first base is part of baseball.
American League Championship Series is a real baseball series. Throwing to first base is part of baseball. So the best answer is: (B) implausible.
- South America > Uruguay > Maldonado > Maldonado (0.04)
- Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
- North America > United States > California > Los Angeles County > Los Angeles (0.04)
- Leisure & Entertainment > Sports > Martial Arts (1.00)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
- Law (1.00)
- (13 more...)
- Europe > Germany > Lower Saxony (0.14)
- Asia > Singapore (0.04)
- Asia > Indonesia > Bali (0.04)
- (5 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Media > Film (1.00)
- Education > Curriculum > Subject-Specific Education (1.00)
- Leisure & Entertainment > Sports > Soccer (0.93)
- Information Technology (0.67)
- Media (1.00)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- (3 more...)
- North America > United States (0.14)
- North America > Canada (0.04)
- Asia > Singapore (0.04)
- (2 more...)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.92)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
- (2 more...)
- North America > Canada (0.04)
- Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)
- Asia > Middle East > Israel > Haifa District > Haifa (0.04)
- Asia > Afghanistan > Parwan Province > Charikar (0.04)
- Leisure & Entertainment (0.68)
- Media > Film (0.47)
8bb0d291acd4acf06ef112099c16f326-Supplemental-Conference.pdf
Last Letters F 500 15.0 -
Coin Flip Y 500 37.0 -
A.2.2 Dataset creation
Regarding "Last Letter Concatenation" and "Coin Flip", the datasets are not publicly available, so we created them following Wei et al. [2022] with a minor rephrasing of the question template. As for Coin Flip, we use the following template.
A.5 Prompts For Answer Extraction
Table 9 and Table 10 summarize the answer extraction prompts used for the experiments in Table 1.
Number: Pick up the first number encountered in the text.
Multiple Choice: Pick up the first large letter encountered in the text.
Yes or No: Pick up the first "yes" or "no" encountered in the text after removing unnecessary letters.
Table 13 lists example texts generated by Zero-shot-CoT for each reasoning extraction template (see Table 4).
Dataset Question Answer
SingleEq
Q: A spaceship traveled 0.5 of a light-year from Earth to Planet X and 0.1 of a light-year from Planet X to Planet Y.
A: Let's think step by step. So the total distance the spaceship traveled is 0.5 + 0.1 + 0.1 = 0.7 light-years. Therefore, the answer (arabic numerals) is: 0.7 light-years
Q: While making desserts for a bake sale, Victor used 0.625 of a scoop of brown sugar as well as 0.25 of a scoop of white sugar. How much more brown sugar did Victor use?
A: Let's think step by step.
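The three answer extraction rules quoted above (first number, first large letter, first "yes" or "no") can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function names, the A to E letter range, the comma stripping, and the definition of "unnecessary letters" as anything other than lowercase letters and spaces are all assumptions made for the example.

```python
import re

def extract_number(text):
    """Return the first number found in the text as a string, or None.
    Assumption: commas are thousands separators and are dropped first."""
    match = re.search(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return match.group(0) if match else None

def extract_choice(text, letters="ABCDE"):
    """Return the first capital letter from `letters` found in the text, or None.
    Assumption: answer options are labelled A through E."""
    match = re.search(f"[{letters}]", text)
    return match.group(0) if match else None

def extract_yes_no(text):
    """Return the first standalone "yes" or "no" in the text, or None.
    Assumption: "unnecessary letters" means case and punctuation."""
    cleaned = re.sub(r"[^a-z ]", " ", text.lower())
    match = re.search(r"\b(yes|no)\b", cleaned)
    return match.group(1) if match else None

if __name__ == "__main__":
    print(extract_number("0.7 light-years"))
    print(extract_choice("(B) implausible."))
    print(extract_yes_no("No, the coin is not heads up."))
```

Because `extract_choice` takes the first matching capital letter, it is only reliable on the short completion that follows an answer-trigger prompt; a completion that opens with a word such as "A spaceship" would be read as choice A.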
- North America > United States (0.14)
- North America > Mexico (0.04)
- Asia > Middle East > Republic of Türkiye > Batman Province > Batman (0.04)